Performance, optimization, and fitness: Connecting applications to architectures

Internal identifier: 000767 (Main/Exploration); previous: 000766; next: 000768

Authors: Mohammad A. Bhuiyan [United States]; Melissa C. Smith [United States]; Vivek K. Pallipuram [United States]

Source:

Concurrency and Computation: Practice and Experience [ 1532-0626 ] ; 2011-07.

RBID : ISTEX:E43C7B0758A3069211F596C690F59417BBC0471F

English descriptors

KwdEn :
- Fitness model, GPU, multicore, optimization, performance.

Abstract

Recent trends involving multicore processors and graphical processing units (GPUs) focus on exploiting task‐ and thread‐level parallelism. In this paper, we have analyzed various aspects of the performance of these architectures including NVIDIA GPUs, and multicore processors such as Intel Xeon, AMD Opteron, IBM's Cell Broadband Engine. The case study used in this paper is a biological spiking neural network (SNN), implemented with the Izhikevich, Wilson, Morris–Lecar, and Hodgkin–Huxley neuron models. The four SNN models have varying requirements for communication and computation making them useful for performance analysis of the hardware platforms. We report and analyze the variation of performance with network (problem size) scaling, available optimization techniques and execution configuration. A Fitness performance model, that predicts the suitability of the architecture for accelerating an application, is proposed and verified with the SNN implementation results. The Roofline model, another existing performance model, has also been utilized to determine the hardware bottleneck(s) and attainable peak performance of the architectures. Significant speedups for the four SNN neuron models utilizing these architectures are reported; the maximum speedup of 574x was observed in our GPU implementation. Our results and analysis show that a proper match of architecture with algorithm complexity provides the best performance. Copyright © 2010 John Wiley & Sons, Ltd.
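
The Roofline model mentioned in the abstract bounds attainable performance by the lower of peak compute throughput and peak memory bandwidth multiplied by operational intensity. The following is a minimal sketch of that bound only; it is not the authors' code, the function names are hypothetical, and the figures in the usage note are placeholders rather than measurements from the paper.

```python
def roofline_gflops(peak_gflops, peak_bandwidth_gbs, intensity_flops_per_byte):
    """Attainable GFLOP/s under the Roofline bound.

    All three arguments are illustrative inputs, not values from the paper.
    """
    # Ceiling imposed by memory traffic: bytes/s times FLOPs per byte.
    memory_ceiling = peak_bandwidth_gbs * intensity_flops_per_byte
    # Performance can exceed neither the compute peak nor the memory ceiling.
    return min(peak_gflops, memory_ceiling)


def bottleneck(peak_gflops, peak_bandwidth_gbs, intensity_flops_per_byte):
    """Return 'memory' or 'compute' depending on which ceiling is lower."""
    # The ridge point is the intensity at which the two ceilings meet.
    ridge = peak_gflops / peak_bandwidth_gbs
    return "memory" if intensity_flops_per_byte < ridge else "compute"
```

For a hypothetical device with a 100 GFLOP/s peak and 10 GB/s of bandwidth, a kernel performing 2 FLOPs per byte is memory bound, and the bound evaluates to 20 GFLOP/s.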

Url:

https://api.istex.fr/document/E43C7B0758A3069211F596C690F59417BBC0471F/fulltext/pdf

DOI: 10.1002/cpe.1688

Affiliations:

Links toward previous steps (curation, corpus...)

to stream Istex, to step Corpus: 000786
to stream Istex, to step Curation: 000786
to stream Istex, to step Checkpoint: 000187
to stream Main, to step Merge: 000769
to stream Main, to step Curation: 000767

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Performance, optimization, and fitness: Connecting applications to architectures</title>
<author><name sortKey="Bhuiyan, Mohammad A" sort="Bhuiyan, Mohammad A" uniqKey="Bhuiyan M" first="Mohammad A." last="Bhuiyan">Mohammad A. Bhuiyan</name>
</author>
<author><name sortKey="Smith, Melissa C" sort="Smith, Melissa C" uniqKey="Smith M" first="Melissa C." last="Smith">Melissa C. Smith</name>
</author>
<author><name sortKey="Pallipuram, Vivek K" sort="Pallipuram, Vivek K" uniqKey="Pallipuram V" first="Vivek K." last="Pallipuram">Vivek K. Pallipuram</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:E43C7B0758A3069211F596C690F59417BBC0471F</idno>
<date when="2011" year="2011">2011</date>
<idno type="doi">10.1002/cpe.1688</idno>
<idno type="url">https://api.istex.fr/document/E43C7B0758A3069211F596C690F59417BBC0471F/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000786</idno>
<idno type="wicri:Area/Istex/Curation">000786</idno>
<idno type="wicri:Area/Istex/Checkpoint">000187</idno>
<idno type="wicri:doubleKey">1532-0626:2011:Bhuiyan M:performance:optimization:and</idno>
<idno type="wicri:Area/Main/Merge">000769</idno>
<idno type="wicri:Area/Main/Curation">000767</idno>
<idno type="wicri:Area/Main/Exploration">000767</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Performance, optimization, and fitness: Connecting applications to architectures</title>
<author><name sortKey="Bhuiyan, Mohammad A" sort="Bhuiyan, Mohammad A" uniqKey="Bhuiyan M" first="Mohammad A." last="Bhuiyan">Mohammad A. Bhuiyan</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634</wicri:regionArea>
<placeName><region type="state">Caroline du Sud</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Smith, Melissa C" sort="Smith, Melissa C" uniqKey="Smith M" first="Melissa C." last="Smith">Melissa C. Smith</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634</wicri:regionArea>
<placeName><region type="state">Caroline du Sud</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Pallipuram, Vivek K" sort="Pallipuram, Vivek K" uniqKey="Pallipuram V" first="Vivek K." last="Pallipuram">Vivek K. Pallipuram</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634</wicri:regionArea>
<placeName><region type="state">Caroline du Sud</region>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Concurrency and Computation: Practice and Experience</title>
<title level="j" type="abbrev">Concurrency Computat.: Pract. Exper.</title>
<idno type="ISSN">1532-0626</idno>
<idno type="eISSN">1532-0634</idno>
<imprint><publisher>John Wiley &amp; Sons, Ltd.</publisher>
<pubPlace>Chichester, UK</pubPlace>
<date type="published" when="2011-07">2011-07</date>
<biblScope unit="volume">23</biblScope>
<biblScope unit="issue">10</biblScope>
<biblScope unit="page" from="1066">1066</biblScope>
<biblScope unit="page" to="1100">1100</biblScope>
</imprint>
<idno type="ISSN">1532-0626</idno>
</series>
<idno type="istex">E43C7B0758A3069211F596C690F59417BBC0471F</idno>
<idno type="DOI">10.1002/cpe.1688</idno>
<idno type="ArticleID">CPE1688</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">1532-0626</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Fitness model</term>
<term>GPU</term>
<term>multicore</term>
<term>optimization</term>
<term>performance</term>
</keywords>
</textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Recent trends involving multicore processors and graphical processing units (GPUs) focus on exploiting task‐ and thread‐level parallelism. In this paper, we have analyzed various aspects of the performance of these architectures including NVIDIA GPUs, and multicore processors such as Intel Xeon, AMD Opteron, IBM's Cell Broadband Engine. The case study used in this paper is a biological spiking neural network (SNN), implemented with the Izhikevich, Wilson, Morris–Lecar, and Hodgkin–Huxley neuron models. The four SNN models have varying requirements for communication and computation making them useful for performance analysis of the hardware platforms. We report and analyze the variation of performance with network (problem size) scaling, available optimization techniques and execution configuration. A Fitness performance model, that predicts the suitability of the architecture for accelerating an application, is proposed and verified with the SNN implementation results. The Roofline model, another existing performance model, has also been utilized to determine the hardware bottleneck(s) and attainable peak performance of the architectures. Significant speedups for the four SNN neuron models utilizing these architectures are reported; the maximum speedup of 574x was observed in our GPU implementation. Our results and analysis show that a proper match of architecture with algorithm complexity provides the best performance. Copyright © 2010 John Wiley &amp; Sons, Ltd.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Caroline du Sud</li>
</region>
</list>
<tree><country name="États-Unis"><region name="Caroline du Sud"><name sortKey="Bhuiyan, Mohammad A" sort="Bhuiyan, Mohammad A" uniqKey="Bhuiyan M" first="Mohammad A." last="Bhuiyan">Mohammad A. Bhuiyan</name>
</region>
<name sortKey="Pallipuram, Vivek K" sort="Pallipuram, Vivek K" uniqKey="Pallipuram V" first="Vivek K." last="Pallipuram">Vivek K. Pallipuram</name>
<name sortKey="Smith, Melissa C" sort="Smith, Melissa C" uniqKey="Smith M" first="Melissa C." last="Smith">Melissa C. Smith</name>
</country>
</tree>
</affiliations>
</record>
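
As an alternative to the Dilib tools, the main bibliographic fields can be read from a record like the one above with a general-purpose XML parser. The following is a sketch using Python's standard library. It assumes a trimmed copy of the record in which the `wicri:` attributes are omitted, because that prefix is not declared in the record as shown and a strict parser would reject it. `SAMPLE` and `summarize` are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical excerpt of the record; wicri:-prefixed attributes
# are left out so that the snippet parses without a namespace declaration.
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">Performance, optimization, and fitness: Connecting applications to architectures</title>
<author><name first="Mohammad A." last="Bhuiyan">Mohammad A. Bhuiyan</name></author>
<author><name first="Melissa C." last="Smith">Melissa C. Smith</name></author>
<author><name first="Vivek K." last="Pallipuram">Vivek K. Pallipuram</name></author>
</titleStmt>
<publicationStmt><idno type="doi">10.1002/cpe.1688</idno></publicationStmt>
</fileDesc></teiHeader></TEI></record>"""


def summarize(xml_text):
    """Return the title, author names, and DOI found in a record string."""
    root = ET.fromstring(xml_text)
    # The first title under titleStmt is the article title.
    title = root.findtext(".//titleStmt/title")
    # Each author element wraps a single name element.
    authors = [name.text for name in root.findall(".//titleStmt/author/name")]
    # idno elements are distinguished by their type attribute.
    doi = root.findtext(".//idno[@type='doi']")
    return {"title": title, "authors": authors, "doi": doi}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To parse the full record unchanged, the `wicri` prefix would first need a namespace declaration on the root element; the URI to use is not given on this page.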

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000767 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000767 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:E43C7B0758A3069211F596C690F59417BBC0471F
   |texte=   Performance, optimization, and fitness: Connecting applications to architectures
}}

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024

Cyberinfrastructure exploration server
Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.
